FY7 POLYMERASE 

CROSS-REFERENCE TO RELATED APPLICATIONS 
This application claims priority to United States Provisional Application Serial No. 
60/089,556, filed on June 17, 1998, the entire disclosure of which is incorporated in its 
entirety herein. 

BACKGROUND OF THE INVENTION 

Field of the Invention 

The instant disclosure pertains to thermostable DNA polymerases which exhibit 
improved robustness and efficiency. 
Background 

DNA polymerases are enzymes which are useful in many recombinant DNA 
techniques such as nucleic acid amplification by the polymerase chain reaction ("PCR"), self- 
sustained sequence replication ("3SR"), and high temperature DNA sequencing. 
Thermostable polymerases are particularly useful. Because heat does not destroy the 
polymerase activity, there is no need to add additional polymerase after every denaturation 
step. 

However, many thermostable polymerases have been found to display a 5' to 3' 
exonuclease or structure-dependent single-stranded endonuclease ("SDSSE") activity which 
may limit the amount of product produced or contribute to the plateau phenomenon in the 
normally exponential accumulation of product. Such 5' to 3' nuclease activity may 
contribute to an impaired ability to efficiently generate long PCR products greater than or 
equal to 10 kb, particularly for G+C rich targets. In DNA sequencing applications and cycle 
sequencing applications, the presence of 5' to 3' nuclease activity may contribute to a 
reduction in desired band intensities and/or generation of spurious or background bands. 

Additionally, many of the enzymes presently available are sensitive to high salt 
environments, a condition commonly 

Presently available enzymes also have limited processivity: they are more distributive, 
dissociating from the template more often. 

dITP is added to address compression problems, but it usually kills the activity of the enzyme. 

Thus, a need continues to exist for an improved DNA polymerase having increased 
tolerance to high salt conditions, efficient utilization of dITP, high productivity, and 
improved performance on GC-rich templates. 
BRIEF SUMMARY OF THE INVENTION 
The instant disclosure teaches a purified recombinant thermostable DNA polymerase 
comprising the amino acid sequence set forth in Figure 1, as well as a purified recombinant 
thermostable DNA polymerase which exhibits at least about 80% activity at salt 
concentrations of 50 mM and greater. The instant disclosure further teaches a purified 
recombinant thermostable DNA polymerase which exhibits at least about 70% activity at salt 
concentrations of 25 mM and greater, and a purified recombinant thermostable DNA 
polymerase having a processivity of about 30 nucleotides per binding event. 

The instant disclosure also teaches an isolated nucleic acid that encodes a thermostable 
DNA polymerase, wherein said nucleic acid consists of the nucleotide sequence set forth in 
Figure 1, as well as a recombinant DNA vector that comprises the nucleic acid, and a 
recombinant host cell transformed with the vector. 

The instant disclosure also teaches a method of sequencing DNA comprising the step 
of generating chain terminated fragments from the DNA template to be sequenced with the 
DNA polymerase in the presence of at least one chain terminating agent and one or more 
nucleotide triphosphates, and determining the sequence of said DNA from the sizes of said 
fragments. The instant disclosure also teaches a kit for sequencing DNA comprising the 
DNA polymerase. 

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS 

FIGURE 1 depicts the amino acid sequence (and DNA sequence encoding therefor) 
for the FY7 polymerase. 

FIGURE 2 depicts the DNA sequence of M13mp18 DNA sequenced using the FY7 
polymerase formulated in Mn conditions, as shown by a print out from an ABI model 377 
automated fluorescent DNA sequencing apparatus. 

FIGURE 3 depicts the DNA sequence of M13mp18 DNA sequenced using the FY7 
polymerase formulated in Mg conditions, as shown by a print out from an ABI model 377 
automated fluorescent DNA sequencing apparatus. 

FIGURE 4 depicts the percent of maximum polymerase activity for Thermo Sequenase 
enzyme DNA polymerase versus FY7 DNA polymerase under varying KCl concentrations. 

FIGURE 5 depicts the effect of high salt concentrations on DNA sequencing ability in 
radioactively labeled DNA sequencing reactions using Thermo Sequenase enzyme DNA 
polymerase versus FY7 DNA polymerase. 

FIGURES 6-10 depict the effect of increasing salt concentration on the performance of Thermo 
Sequenase. At concentrations as low as 25 mM, data quality is affected, with the read length 
decreasing from at least 600 bases to about 450 bases. At 50 mM salt the read length decreases 
further to about 350 bases, at 75 mM to about 250 bases, and at 100 mM the read length is negligible. 

FIGURES 11-15 depict the effect of increasing salt concentration on the performance of FY7 
DNA polymerase. There is no detrimental effect on performance up to at least 75 mM KCl and only a 
slight decrease in data quality at 100 mM KCl. 

FIGURE 16 depicts the processivity measured for Thermo Sequenase DNA polymerase and 
AmpliTaq FS DNA polymerase, compared with the processivity measured for FY7 DNA 
polymerase. 

FIGURE 17 depicts the improved read length obtained when using FY7 polymerase 
versus Thermo Sequenase DNA polymerase in radioactively labeled sequencing reactions 
incorporating the dGTP (deoxyguanosine triphosphate) analog dITP (deoxyinosine triphosphate) at 
72 °C. 

FIGURES 18-22 show the effect of increasing extension step time on the read length and data 
quality produced by Thermo Sequenase DNA polymerase in fluorescently labeled terminator DNA 
sequencing reactions. 

FIGURES 23-27 show the effect of increasing extension step time on the read length and data 
quality produced by FY7 DNA polymerase in fluorescently labeled terminator DNA sequencing 
reactions. 

DETAILED DESCRIPTION OF THE INVENTION 
A series of polymerase mutants were constructed with the aim of obtaining an 
improved polymerase for DNA sequencing, by reducing the exonuclease activity found in 
full length Thermus thermophilus and Thermus aquaticus DNA polymerase I enzymes. Six 
conserved motifs (Gutman and Minton (1993) Nucleic Acids Research 21, 4406 - 4407) can 
be identified in the amino-terminal domain of pol I type polymerases, in which the 5' to 3' 
exonuclease activity has been shown to reside. Further, six carboxylate residues in these 
conserved regions have been shown in a crystal structure to be located at the active site of the 
exonuclease domain of Thermus aquaticus DNA pol I (Kim et al., (1995) Nature 376, 612- 
616). Point mutations were made by site-directed mutagenesis to carboxylates and other 
residues in three of six conserved motifs in Tth and Taq polymerases as follows: 
Taq D18A, Taq T140V, Taq D142N/D144N. All of these have the mutation F667Y outside 
of the exonuclease domain. 

Tth D18A, Tth T141V, Tth D143N/D145N. All of these have the mutation F669Y outside of 
the exonuclease domain. 

All polymerases were evaluated for exonuclease activity, processivity, strand 
displacement, salt tolerance, thermostability, and sequencing quality. One FY7 polymerase, 
Tth D18A, F669Y, is described in further detail below. 

EXAMPLES 

Methods 

In vitro mutagenesis 

PCR was employed to introduce an aspartic acid to alanine amino acid change at 
codon 18 (D18A) of cloned full length F669Y Tth (plasmid pMR10). Mutagenic Primer 1 
(CTGTTCGAACCCAAAGGCCGTGTCCTCCTGGTGGCCGGCCACCAC) spans 
nucleotides 19-60 of pMR10 including codon 18 and a BstBI restriction site. Oligonucleotide 
Primer 2 (GAGGCTGCCGAATTCCAGCCTCTC) spans an EcoRI site of pMR10. pMR10 
was used as template DNA. The PCR product was digested with BstBI and EcoRI and 
ligated to two fragments of pMR10: a 5000 bp KpnI / BstBI and a 2057 bp EcoRI / KpnI, 
creating plasmid pMR12. Cells of E. coli strain DH1λ+ were used for primary 
transformation, and strain M5248 (λ cI857) was used for protein expression, although any 
comparable pair of E. coli strains carrying the cI+ and cI857 alleles could be utilized. 
Alternatively, any rec+ cI+ strain could be induced by chemical agents such as nalidixic acid 
to produce the polymerase. 
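
The primer design described above can be checked with a short script. The following is only a sketch: it assumes the standard genetic code, the usual recognition sequences for BstBI (TTCGAA) and EcoRI (GAATTC), and that Primer 1 begins at nucleotide 19 of the coding sequence (the first base of codon 7), as the stated nucleotide range suggests. The helper name `translate` is illustrative and not part of the disclosure.

```python
# Standard genetic code; codons are ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Primer sequences as printed in the text.
PRIMER_1 = "CTGTTCGAACCCAAAGGCCGTGTCCTCCTGGTGGCCGGCCACCAC"
PRIMER_2 = "GAGGCTGCCGAATTCCAGCCTCTC"

# Recognition sequences of the two restriction enzymes.
BSTBI_SITE = "TTCGAA"
ECORI_SITE = "GAATTC"


def translate(seq, first_codon):
    """Translate seq in frame 1 and key each residue by its codon number."""
    return {
        first_codon + i // 3: CODON_TABLE[seq[i:i + 3]]
        for i in range(0, len(seq) - 2, 3)
    }


# Assumption: the primer starts at nucleotide 19 of the coding sequence,
# which is the first base of codon 7.
first_codon = (19 - 1) // 3 + 1
residues = translate(PRIMER_1, first_codon)

print("BstBI site in primer 1:", BSTBI_SITE in PRIMER_1)
print("EcoRI site in primer 2:", ECORI_SITE in PRIMER_2)
print("Residue encoded at codon 18:", residues[18])
```

Under these assumptions both restriction sites are found and codon 18 of Primer 1 (GCC) encodes alanine, consistent with the D18A change.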
Purification of Polymerase 

M5248 containing plasmid pMR12 was grown in one liter of LB medium (1% 
tryptone, 0.5% yeast extract, 1% NaCl), preferably 2X LB medium, containing 100 mg/ml 
ampicillin at 30°C. When the OD600 reached 1.0, the culture was induced at 42°C for 1.5 
hours. The cultures were then cooled to <20°C and the cells harvested by centrifugation in a 
Sorvall RC-3B centrifuge at 5000 rpm at 4°C for 15 to 30 minutes. Harvested cells were 
stored at -80°C. 

The cell pellet was resuspended in 25 ml pre-warmed lysis buffer (50 mM Tris-HCl 
pH 8.0, 10 mM MgCl2, 16 mM (NH4)2SO4, 1 mM EDTA, 0.1%, preferably 0.2% Tween 20, 
0.1%, preferably 0.2% NP40). Preferably, the lysis buffer contains 300 mM NaCl. 
Resuspended cells were incubated at 75 - 85°C for 10-20 minutes, sonicated for 1 minute, and 
cleared by centrifugation. The cleared lysate was passed through a 300 ml column of 
diethylaminoethyl cellulose (Whatman DE 52) equilibrated in buffer A (50 mM Tris-HCl pH 
8.0, 1 mM EDTA, 0.1% Tween 20, 0.1% NP40) containing 100 mM, preferably 300 mM 
NaCl. Fractions were assayed for polymerase activity, and those demonstrating peak 
polymerase activity were pooled, diluted to 50 mM NaCl with Buffer A, and loaded onto a 
heparin sepharose column (20 ml) equilibrated with 50 mM NaCl in buffer A. The 
polymerase was eluted from the column with a linear salt gradient from 50 mM to 700 mM 
NaCl in buffer A. Fractions were assayed for polymerase activity, and those demonstrating 
peak activity were pooled and dialyzed against final buffer (20 mM Tris-HCl pH 8.5, 50% 
(v/v) glycerol, 0.1 mM EDTA, 0.5% Tween 20, 0.5% NP40, 1 mM DTT, 100 mM KCl). The 
purified protein is designated FY7. The amino acid sequence (and DNA sequence encoding 
therefor) are presented in Figure 1. 
Bacterial Strains 

E. coli strains: DH1λ+ [gyrA96, recA1, relA1, endA1, thi-1, hsdR17, supE44, λ+]; 
M5248 [λ (bio275, cI857, cIII+, N+, Δ(H1))]. 
PCR 

Plasmid DNA from E. coli DH1λ+ (pMR10) was prepared by the SDS alkaline lysis 
method (Sambrook et al., Molecular Cloning 2nd Ed. Cold Spring Harbor Press, 1989). 
Reaction conditions were as follows: 10 mM Tris-HCl pH 8.3, 50 mM KCl, 1.5 mM MgCl2, 
0.001% gelatin, 1 µM each primer, 2.5 U Taq polymerase, per 100 µl reaction. Cycling 
conditions were 94°C 2 minutes, then 35 cycles of 94°C 30s, 55°C 30s, 72°C 3 minutes, 
followed by 72°C for 7 minutes. 
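
As a rough planning aid, the cycling program above can be written down as data and its total hold time summed. This is a minimal sketch: the names are illustrative, and the calculation ignores temperature ramp times, which depend on the thermal cycler used.

```python
# Cycling program from the text, with all hold times in seconds.
INITIAL_DENATURATION_S = 2 * 60           # 94°C for 2 minutes
CYCLES = 35
CYCLE_STEPS_S = {
    "denature_94C": 30,
    "anneal_55C": 30,
    "extend_72C": 3 * 60,
}
FINAL_EXTENSION_S = 7 * 60                # 72°C for 7 minutes


def total_hold_time(initial_s, cycles, steps_s, final_s):
    """Sum of programmed hold times; ramping between temperatures is ignored."""
    return initial_s + cycles * sum(steps_s.values()) + final_s


total_s = total_hold_time(
    INITIAL_DENATURATION_S, CYCLES, CYCLE_STEPS_S, FINAL_EXTENSION_S
)
print(f"Programmed hold time: {total_s} s ({total_s / 60:.0f} min)")
```

The programmed holds add up to 8940 seconds, or 149 minutes, before ramping is taken into account.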

Example 1 Formulation of the enzyme in Mn conditions 

In the following "pre-mix" protocol, all the reagents are contained in two solutions; 
reagent mix A and reagent mix B. 
Reagent Mix A 

The following reagents were combined to make 10 ml of reagent mix A: 
2.5 ml 1 M HEPPS N-(2-hydroxyethyl) piperazine-N'-(3-propanesulfonic acid), pH 8.0 
500 µl 1 M tartaric acid, pH 8.0 
50,000 units FY7 DNA polymerase 

1 unit Thermoplasma acidophilum inorganic pyrophosphatase 
100 µl 100 mM dATP 
100 µl 100 mM dTTP 



WO 99/65938 g PCT/US99/13741 

100 µl 100 mM dCTP 
500 µl 100 mM dITP 

9.375 µl 100 µM C-7-propargylamino-4-rhodamine-6-G-ddATP 
90 µl 100 µM C-5-propargylamino-4-rhodamine-X-ddCTP 
6.75 µl 100 µM C-7-propargylamino-4-rhodamine-110-ddGTP 
165 µl 100 µM C-5-propargylamino-4-tetramethylrhodamine-ddUTP 

10 µl 50 mM EDTA 
1 ml glycerol 

The volume was made up to 10,000 µl with deionized H2O. 
Reagent Mix B 

The following reagents were combined to make 10 ml of reagent mix B: 
10 µl 1 M MES 2-(N-morpholino)ethanesulfonic acid, pH 6.0 

200 µl 1 M MgCl2 
75 µl 1 M MnSO4 

The volume was made up to 10,000 µl with deionized H2O. 
Example 2: Use of the formulation from Example 1 

Two (2) µl reagent mix A, 2 µl reagent mix B, 200 ng M13mp18 DNA, 5 pmole of 
primer (M13 -40 Forward 5'-GTTTTCCCAGTCACGACGTTGTA), and deionized water to 
a total volume of 20 µl were mixed together and subjected to 25 cycles of (95 °C 30 seconds, 
60 °C 1 minute) in a thermal cycler. After cycling, 4 µl of a solution which contained 1.5 M 
sodium acetate, 250 mM EDTA was added. The solution was mixed and 4 volumes (100 µl) 
of ethanol added. The DNA was precipitated by incubation on ice for 15-20 minutes 
followed by centrifugation. The supernatant was removed and the pellet was washed with 
70% ethanol, dried and resuspended in 4 µl of formamide containing loading dye. The 
resuspended DNA was then run on an automated fluorescent DNA sequencing apparatus 
(ABI model 377 instrument). The print out from the machine of the DNA sequence is shown 
as Figure 2. 
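
The concentrations that result in the 20 µl reaction of Example 2 follow from two successive dilutions: stock into the 10 ml reagent mix, then 2 µl of mix into the reaction. The sketch below works this through for a few components. It assumes that the volume units garbled in the reagent lists are microliters, and the names `dilute`, `mix_a` and `mix_b` are illustrative only.

```python
def dilute(stock_conc, stock_vol_ul, total_vol_ul):
    """Concentration after diluting stock_vol_ul of stock into total_vol_ul."""
    return stock_conc * stock_vol_ul / total_vol_ul


MIX_VOLUME_UL = 10_000        # each reagent mix is made up to 10 ml
REACTION_VOLUME_UL = 20       # final sequencing reaction volume
MIX_PER_REACTION_UL = 2       # 2 µl of each mix per reaction

# component: (stock concentration in mM, volume added in µl)
# Assumption: volumes are read as microliters where the source is garbled.
mix_a = {"HEPPS": (1000, 2500), "dATP": (100, 100), "dITP": (100, 500)}
mix_b = {"MgCl2": (1000, 200), "MnSO4": (1000, 75)}

reaction_mM = {}
for name, (stock_mM, vol_ul) in {**mix_a, **mix_b}.items():
    in_mix_mM = dilute(stock_mM, vol_ul, MIX_VOLUME_UL)
    reaction_mM[name] = dilute(in_mix_mM, MIX_PER_REACTION_UL, REACTION_VOLUME_UL)

# 50,000 units of FY7 in 10 ml of mix A, 2 µl of mix A per reaction.
fy7_units_per_reaction = 50_000 / MIX_VOLUME_UL * MIX_PER_REACTION_UL

for name, conc in reaction_mM.items():
    print(f"{name}: {conc:g} mM in the reaction")
print(f"FY7: {fy7_units_per_reaction:g} units per reaction")
```

The same two-step calculation applies to any other component of either mix.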

Example 3 Formulation of the enzyme in Mg conditions 

In the following "pre-mix" protocol, all the reagents are contained in one solution. 
Sequencing premix 

The following reagents were combined to make 800 µl of Sequencing premix: 
200 µl of 500 mM Tris-HCl pH 9.5, 20 mM MgCl2 
100 µl 40 units/µl FY7 DNA polymerase, 0.0008 units/µl Thermoplasma acidophilum 
inorganic pyrophosphatase 
100 µl 10 mM dITP, 2 mM dATP, 2 mM dTTP, 2 mM dCTP 
100 µl 0.125 µM C-7-propargylamino-4-rhodamine-6-G-ddATP 
100 µl 1.2 µM C-5-propargylamino-4-rhodamine-X-ddCTP 
100 µl 0.09 µM C-7-propargylamino-4-rhodamine-110-ddGTP 
100 µl 2.2 µM C-5-propargylamino-4-tetramethylrhodamine-ddUTP 
Example 4 Use of the formulation from example 3 

Four (4) µl of sequencing premix, 200 ng M13mp18 DNA, 5 pmole of primer (M13 -40 
Forward 5'-GTTTTCCCAGTCACGACGTTGTA), and deionized water to a total volume 
of 20 µl were mixed together and subjected to 25 cycles of (95 °C 30 seconds, 60 °C 2 
minutes) in a thermal cycler. After cycling, 7 µl of 7.5 M ammonium acetate was added. The 
solution was mixed and 4 volumes (100 µl) of ethanol added. The DNA was precipitated by 
incubation on ice for 15-20 minutes followed by centrifugation. The supernatant was 
removed and the pellet was washed with 70% ethanol, dried and resuspended in 4 µl of 
formamide containing loading dye. The resuspended DNA was then run on an automated 
fluorescent DNA sequencing apparatus (ABI model 377 instrument). The print out from the 
machine of the DNA sequence is shown as Figure 3. 

Example 5 Polymerase Activity versus Salt Concentration (KCl) for Thermo Sequenase™ enzyme 
and FY7 enzyme 

The percent of maximum polymerase activity was measured for Thermo Sequenase 
enzyme DNA polymerase and FY7 DNA polymerase under varying KCl concentrations. The results 
are depicted in Figure 4. The data indicate that FY7 has a much higher salt optimum as well as a 
broader range of tolerance for salt in the reaction mixture than Thermo Sequenase. The salt 
concentration which gives 50% activity is five-fold higher for FY7 than for Thermo Sequenase. 

The effect of high salt concentrations on DNA sequencing ability in radioactively labeled 
DNA sequencing reactions was also examined. The results are presented in Figure 5. At KCl 
concentrations of 50 mM or higher, Thermo Sequenase™ polymerase performance degrades to levels 
at which usable data cannot be extracted. FY7 DNA polymerase, however, is able to give quite good 
sequencing data at KCl concentrations of 100 mM. 
Example 6 Fluorescent Sequencing Salt Tolerance 
These experiments examined the effect of the above-demonstrated polymerase activity in 
high salt concentrations on DNA sequencing ability in fluorescently labeled terminator DNA 
sequencing reactions. The results are presented in Figures 6-15. 

Figures 6-10 show the effect of increasing salt concentration on the performance of Thermo 
Sequenase. At concentrations as low as 25mM data quality is affected with the read length being 
decreased from at least 600 bases to about 450 bases. At 50mM salt the read length is further 
decreased to about 350 bases, 75 mM to about 250 bases and at 100 mM the read length is negligible. 

Figures 11-15 show the effect of increasing salt concentration on the performance of FY7 
DNA polymerase. There is no detrimental effect on performance to at least 75 mM KCl and only a 
slight decrease in data quality at 100 mM KCl. 

As it is recognized that some types of DNA preparations may be contaminated with salt 
(which is detrimental to DNA sequencing data quality), the use of FY7 DNA polymerase allows for 
a more robust sequencing reaction over a broader range of template conditions. 
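
To put the Thermo Sequenase read lengths reported for Figures 6-10 on a common scale, they can be expressed as a fraction of the read length without added salt. The sketch below uses the approximate values quoted in the text; it assumes the "at least 600 bases" figure corresponds to 0 mM added salt, so the fractions are upper estimates, and it omits the 100 mM point because the text describes it only as negligible.

```python
# Approximate Thermo Sequenase read lengths (bases) by added salt (mM),
# as quoted in the text. Assumption: 600 bases is the no-added-salt baseline.
READ_LENGTH_BASES = {0: 600, 25: 450, 50: 350, 75: 250}

baseline = READ_LENGTH_BASES[0]
retained = {
    salt_mM: bases / baseline for salt_mM, bases in READ_LENGTH_BASES.items()
}

for salt_mM, fraction in retained.items():
    print(f"{salt_mM:>3} mM: about {fraction:.0%} of the baseline read length")
```

For FY7, the text reports no detrimental effect up to at least 75 mM KCl, so the corresponding fractions would stay near 100% over the same range.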
Example 7 Polymerase Processivity 

The processivity (number of nucleotides incorporated per DNA polymerase binding event) has been 
measured for different DNA sequencing polymerases. The results are presented in Figure 16. 
Thermo Sequenase DNA polymerase has a processivity of only ~4 nucleotides per binding event. 
AmpliTaq FS DNA polymerase has a processivity of ~15 nucleotides per binding event. FY7 DNA 
polymerase, at ~30 nucleotides per binding event, has a processivity more than seven-fold greater 
than Thermo Sequenase DNA polymerase and ~two-fold greater than AmpliTaq FS DNA 
polymerase. 
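
The fold differences stated above follow directly from the approximate processivity values. A minimal sketch of the arithmetic, using the rounded figures from the text (the dictionary name is illustrative):

```python
# Approximate processivity (nucleotides per binding event) from the text.
PROCESSIVITY = {
    "Thermo Sequenase": 4,
    "AmpliTaq FS": 15,
    "FY7": 30,
}

fold_vs_fy7 = {
    name: PROCESSIVITY["FY7"] / value
    for name, value in PROCESSIVITY.items()
    if name != "FY7"
}

for name, fold in fold_vs_fy7.items():
    print(f"FY7 is about {fold:g}-fold more processive than {name}")
```

With these values the ratios are 7.5 and 2, matching the "more than seven-fold" and "two-fold" statements.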

Example 8 Polymerase Extension with dITP at 72 °C 

This series of experiments examined the improved read length obtained when using FY7 polymerase 
versus Thermo Sequenase DNA polymerase in radioactively labeled sequencing reactions 
incorporating the dGTP (deoxyguanosine triphosphate) analog dITP (deoxyinosine triphosphate) at 
72 °C. The results are presented in Figure 17. FY7 is able to incorporate >50-100 more 
nucleotides under standard [α-33P]dATP sequencing conditions than Thermo Sequenase. 
Example 9 Effect of Extension Step Time on Length of Read 

This series of experiments examined the effect of increasing extension step time on the read 
length and data quality of Thermo Sequenase and FY7 DNA polymerases in fluorescently labeled 
terminator DNA sequencing reactions. The results are presented in Figures 18-27. 

Figures 18-22 show the effect of increasing extension step time on the read length and data 
quality produced by Thermo Sequenase DNA polymerase. These data show that a minimum 
two-minute extension step is required by Thermo Sequenase in order to achieve a quality read of at least 
600 bases. Signal strength generally increases to a maximum at a four-minute extension (the time 
specified in the commercial product utilizing this enzyme and method). 

Figures 23-27 show the effect of increasing extension step time on the read length and data 
quality produced by FY7 DNA polymerase. This data shows that a minimum of a 30 second 
extension step is required by FY7 in order to achieve a quality read of at least 600 bases. Signal 
strengths plateau at about one minute extension time. The FY7 DNA polymerase can produce data 
of equivalent quality to Thermo Sequenase in one-quarter to one-half the time of extension reaction. 
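
The "one-quarter to one-half" statement can be reproduced from the extension times given above. The sketch below compares the minimum extension times needed for a 600-base read and the times at which signal strength peaks or plateaus; the variable names are illustrative, and the values are the approximate ones quoted in the text.

```python
# Extension step times in seconds, as quoted in the text.
MIN_EXTENSION_S = {"Thermo Sequenase": 2 * 60, "FY7": 30}
PLATEAU_EXTENSION_S = {"Thermo Sequenase": 4 * 60, "FY7": 60}

# FY7 time as a fraction of the Thermo Sequenase time.
min_ratio = MIN_EXTENSION_S["FY7"] / MIN_EXTENSION_S["Thermo Sequenase"]
plateau_ratio = PLATEAU_EXTENSION_S["FY7"] / PLATEAU_EXTENSION_S["Thermo Sequenase"]
# FY7 at its signal plateau versus Thermo Sequenase at its minimum time.
plateau_vs_min_ratio = PLATEAU_EXTENSION_S["FY7"] / MIN_EXTENSION_S["Thermo Sequenase"]

print(f"Minimum time for a 600-base read: {min_ratio:g} of Thermo Sequenase")
print(f"Time to signal plateau: {plateau_ratio:g} of Thermo Sequenase")
print(f"FY7 plateau vs Thermo Sequenase minimum: {plateau_vs_min_ratio:g}")
```

The first two ratios come to one-quarter and the third to one-half, which brackets the range given in the text.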

Although the above examples describe various embodiments of the invention in 
detail, many variations will be apparent to those of ordinary skill in the art. Accordingly, the 
above examples are intended for illustration purposes and should not be used in any way to 
restrict the scope of the appended claims. 
What is claimed is: 

1. A purified recombinant thermostable DNA polymerase comprising the amino acid 
sequence set forth in Figure 1. 

2. A purified recombinant thermostable DNA polymerase which exhibits at least about 80% 
activity at salt concentrations of 50 mM and greater. 

3. A purified recombinant thermostable DNA polymerase which exhibits at least about 70% 
activity at salt concentrations of 25 mM and greater. 

4. A purified recombinant thermostable DNA polymerase having a processivity of about 30 
nucleotides per binding event. 

5. An isolated nucleic acid that encodes a thermostable DNA polymerase, wherein said 
nucleic acid consists of the nucleotide sequence set forth in Figure 1. 

6. A recombinant DNA vector that comprises the nucleic acid of Claim 3. 

7. The recombinant DNA sequence of Claim 4 comprising the plasmid pMR10. 

8. A recombinant host cell transformed with the vector of Claim 5. 

9. The recombinant host cell of Claim 6 that is E. coli. 

10. The recombinant host cell of Claim 7 which is E. coli carrying the cI+ and cI857 alleles. 

11. The recombinant host cell of Claim 7 selected from the group consisting of DH1λ+ 
[gyrA96, recA1, relA1, endA1, thi-1, hsdR17, supE44, λ+] and M5248 [λ (bio275, cI857, 
cIII+, N+, Δ(H1))]. 

12. Method of sequencing DNA comprising the step of generating chain terminated 
fragments from the DNA template to be sequenced with the DNA polymerase of Claim 1 
in the presence of at least one chain terminating agent and one or more nucleotide 
triphosphates, and determining the sequence of said DNA from the sizes of said 
fragments. 
13. A kit for sequencing DNA comprising the DNA polymerase of Claim 1. 
FIGURE 1 (leading codons of each 20-codon row; nucleotide/amino acid position at the start of each row) 

1/1 
ATG GAA GCG ATG CTG CCG CTG TTC GAA 
M   E   A   M   L   P   L   F   E 

61/21 
CAC CTG GCC TAC CGC ACC TTC TTC GCC 
H   L   A   Y   R   T   F   F   A 

121/41 
GTG CAG GCG GTC TAC GGC TTC GCC AAG 
V   Q   A   V   Y   G   F   A   K 

181/61 
AAG GCC GTC TTC GTG GTC TTT GAC GCC 
K   A   V   F   V   V   F   D   A 

241/81 
GCC TAC AAG GCG GGG AGG GCC CCG ACC 
A   Y   K   A   G   R   A   P   T 

301/101 
AAG GAG CTG GTG GAC CTC CTG GGG TTT 
K   E   L   V   D   L   L   G   F 

361/121 
GAC GTT CTC GCC ACC CTG GCC AAG AAG 
D   V   L   A   T   L   A   K   K 

421/141 
ACC GCC GAC CGC GAC 
T   A   D   R   D 

481/161 
GGC CAC CTC ATC ACC 
G   H   L   I   T 

541/181 
GTG GAC TTC CGC GCC 
V   D   F   R   A 

601/201 
GGG GAG AAG ACC GCC 
G   E   K   T   A 

661/221 
AAC CTG GAC CGG GTA 
N   L   D   R   V 

721/241 
CTC AGG CTC TCC TTG 
L   R   L   S   L 

781/261 
GCC CAG GGG CGG GAG 
A   Q   G   R   E 

841/281 
GGC AGC CTC CTC CAC 
G   S   L   L   H 

901/301 
TGG CCC CCG CCG GAA 
W   P   P   P   E 

961/321 
GCG GAG CTT AAA GCC 
A   E   L   K   A 

1021/341 
TTG GCG GGG CTA AAG 
L   A   G   L   K 

1081/361 
TTG GCC TCG AGG GAG 
L   A   S   R   E 

1141/381 
CTC CTG GAC CCC TCC 
L   L   D   P   S 

1201/401 